The sequences of nine different cytokines, growth hormone, and prolactin have been aligned and their secondary structure predicted. The alignment reveals that each exon has a characteristic sequence pattern shared by all cytokines. The most striking sequence similarity is in exon 4, where the residue pair Phe-Leu is conserved in many cytokines. In addition, there are discrete homologous regions between specific growth factors, including a high degree of homology between granulocyte-macrophage colony-stimulating factor (GM-CSF) and interleukin 3 (IL-3). The secondary structure analysis predicts that exon 3 has an antiparallel helix-turn-helix motif, which is likely to form the central helical core of a four α-helical bundle-type structure. Based on the secondary structure and disulfide bonding pattern, the topological connectivity for a number of cytokines has been predicted.
1. INTRODUCTION
Cytokines and growth factors are proteins that are active in hematopoiesis and wound healing and during an immune response (Paul, 1989; Ann et al., 1990). Included in this group of proteins are the interleukins,³ the colony-stimulating factors, erythropoietin, growth hormone, and prolactin. These proteins control a wide range of functions in cells in the lymphoid, hemopoietic, and reticuloendothelial systems by binding to specific cell-surface receptors on target cells. Interestingly, most of the cytokines are pleiotropic and have multiple biological activities (Metcalf, 1989; Mizel, 1989; Nicola, 1989), and this functional redundancy might suggest that these proteins also have overlapping sequence and structural features.
³ Abbreviations used: interleukin 1, IL-1; interleukin 2, IL-2; interleukin 3, IL-3; interleukin 4, IL-4; interleukin 5, IL-5; interleukin 6, IL-6; interleukin 7, IL-7; interleukin 9, IL-9; interleukin 11, IL-11; erythropoietin, EPO; granulocyte colony-stimulating factor, G-CSF; granulocyte-macrophage colony-stimulating factor, GM-CSF; cMGF, chicken myelomonocytic growth factor; growth hormone, GH; prolactin, PRL; circular dichroism, CD.
[...] cytokine receptor superfamily (Bazan, 1989; D'Andrea et al., 1989; Goodwin et al., 1990; Nicola and Metcalf, 1991). Sequence homologies among cytokine ligands, unlike their receptors, are not apparent, except that IL-6 shows limited homology with G-CSF and cMGF (Leutz et al., 1989). However, the genomic organization of these ligands reveals that many of them have four exons encoding the mature form of the protein, irrespective of the variation in their sequence lengths, suggesting that these cytokines might be related.

In an attempt to characterize the homologies and 
other common structural features among the cytokine 
ligands, we have made a detailed sequence and struc- 
tural analysis of these ligands. Our sequence homo- 
logy study reveals a number of discrete highly 
conserved regions that are unique among these 
growth factors. The alignment of secondary structure also indicates two contiguous α-helical segments separated by a small loop as a common structural pattern shared by all growth factors. Using this structural pattern and known disulfide bonds that connect different parts of the molecule, the folding topologies for a number of growth factors are predicted.



2. METHODS 

2.1. Sequence Analysis 

The proteins included in our structural analysis are interleukin 2 (IL-2) (Taniguchi et al., 1983; Kashima et al., 1985), IL-3 (Fung et al., 1984; Yang et al., 1986), IL-4 (Lee et al., 1986; Yokota et al., 1986), IL-5 (Azuma et al., 1986; Kinashi et al., 1986), IL-6 (Hirano et al., 1986; Van Snick et al., 1988), IL-7 (Goodwin et al., 1989; Lupton et al., 1990), IL-9 (Yang et al., 1989), erythropoietin (EPO) (Jacobs et al., 1985; Lin et al., 1985; McDonald et al., 1986), granulocyte colony-stimulating factor (G-CSF) (Nagata et al., 1986; Tsuchiya et al., 1986), granulocyte-macrophage colony-stimulating factor (GM-CSF) (Gough et al., 1984; Wong et al., 1985), growth hormone (GH) (DeNoto et al., 1981; Page et al., 1981), chicken myelomonocytic growth factor (cMGF) (Leutz et al., 1989), and prolactin (PRL) (Barta et al., 1981; Cooke et al., 1981). Both human and murine sequences were examined for these proteins. Sequences were obtained from [...] and analyzed for sequence homology using a [...] homology algorithm (IntelliGenetics) and "BESTFIT" (Devereux et al., 1984) (University of Wisconsin GCG), which employs the local homology algorithm of Smith and Waterman. The sequences were aligned using gap penalties of two, three, and four to maximize the local homologous regions and identify the conserved residues. Using these local homologous regions as the starting point, a multiple-sequence alignment using all peptides was constructed manually for exons 1, 2, 3, and 4 within each protein. Homology was determined by maximizing identical and nearly identical amino acids.
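The local-alignment step can be illustrated with a minimal Smith-Waterman sketch. This is not the BESTFIT implementation: the match/mismatch scores and the linear gap penalty are simplifying assumptions, and the two peptide strings are hypothetical. It only shows how the same pair of sequences can be rescored with gap penalties of two, three, and four.

```python
def smith_waterman(a, b, match=3, mismatch=-1, gap=2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Assumptions: simple match/mismatch scores and a linear gap penalty,
    not the scoring tables used by the original programs.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] - gap, H[i][j - 1] - gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the local score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] - gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical peptides, not taken from any cytokine sequence.
    seq_a, seq_b = "AAAFLKKK", "GGGFLKKKGG"
    for gap in (2, 3, 4):
        print(gap, smith_waterman(seq_a, seq_b, gap=gap))
```

Running the alignment at several gap penalties and keeping the regions that stay aligned is one way to pick out the locally conserved blocks that seed a manual multiple alignment.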



2.2. Secondary Structure Prediction

Secondary structure was predicted by a joint algorithm using the Garnier-Osguthorpe-Robson (GOR) (Garnier et al., 1978), Chou and Fasman (1978), Lim (1974), and Cohen et al. (1986) methods. Since each protein has mouse and human versions, a consensus secondary structure was determined using the following strategy.



First, the conformational parameters (α-helix, β-sheet, β-turns, and coil) for each residue were calculated for both human and murine analogs by the GOR method. The values were averaged for all the aligned residues at each position. The conformational state that showed the highest value at a particular position was predicted to be the likely structure for the residue at that position. The same procedure was used to predict the secondary structure by the Chou and Fasman (C&F) method. β-turns and other irregular regions were also determined by the pattern-matching method developed by Cohen et al. (1986). From these predicted structures, the final assignment of secondary structure was made using the following rules.

A region was considered to be in β-turn or irregular structure if two out of three methods predicted β-turns or irregular conformation for that region.

A region was considered to be in α-helix (or β-sheet) if both GOR and C&F methods predicted α-helix (or β-sheet) for that region [at least six (four) consecutive residues should be in α-helical (β-sheet) conformation]. Whenever there was an ambiguity in assigning a particular structure, Lim's residue distribution method (Lim, 1974) was used to determine the most probable structure for that region.
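The averaging step and the assignment rules above can be sketched as follows. This is an illustrative sketch only: the per-residue score dictionaries, the one-letter state codes (H, E, T, C), and the function names are assumptions introduced here, and Lim's method is not implemented, so ambiguous positions are simply marked "?".

```python
from itertools import groupby

STATES = ("H", "E", "T", "C")  # helix, sheet, turn, coil (assumed codes)


def consensus_state(human_scores, murine_scores):
    """Average per-residue scores over aligned human/murine positions and
    return the highest-scoring state at each position."""
    result = []
    for h, m in zip(human_scores, murine_scores):
        avg = {s: (h[s] + m[s]) / 2 for s in STATES}
        result.append(max(STATES, key=lambda s: avg[s]))
    return "".join(result)


def combine(gor, cf, cohen, min_helix=6, min_sheet=4):
    """Apply the two assignment rules to three per-residue prediction strings.

    'T' or 'C' counts as turn/irregular; '?' marks positions that the rules
    leave ambiguous (the text resolves these with Lim's method).
    """
    if not (len(gor) == len(cf) == len(cohen)):
        raise ValueError("prediction strings must have equal length")
    out = []
    for g, c, k in zip(gor, cf, cohen):
        irregular_votes = sum(state in "TC" for state in (g, c, k))
        if irregular_votes >= 2:            # two of three methods agree
            out.append("T")
        elif g == c and g in "HE":          # GOR and C&F agree on helix/sheet
            out.append(g)
        else:
            out.append("?")
    # Enforce the minimum run lengths: six for helix, four for sheet.
    pos = 0
    for state, run in groupby(list(out)):
        length = len(list(run))
        if length < {"H": min_helix, "E": min_sheet}.get(state, 0):
            out[pos:pos + length] = ["?"] * length
        pos += length
    return "".join(out)
```

Keeping the ambiguous positions explicit makes it easy to see which regions would still need a tie-breaking method.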



Figure 1 shows the alignment of all growth factors included in our analysis. The most striking feature of this alignment is the location of intron/exon boundaries for each growth factor. The exon distribution for the cytokines, growth hormone, and prolactin is given in Table I. Exon size clearly varies with the size of the protein molecule. However, according to our alignment shown in Fig. 1, the exon/intron position for each growth factor falls within a narrow range irrespective of its size. The strongest alignment of exon boundaries occurs after the first exon and before the fourth exon (except hIL-6) in each protein. The location of the intron between the second and third exons is more variable. The intron/exon positions for IL-9 are not known. Based on the sequence alignment, we predict that each boundary for IL-9 should fall within the range specified by other cytokines.

3.2. Homologies in Each Exon 

Figure 1 also reveals that each exon has some characteristic sequence pattern that is highly conserved in all the growth factors. The major region of homology between the growth factors occurs in exon 4, which contains a highly conserved aromatic residue followed by Leu (except one). This Phe-Leu pair is within 20 residues (except murine IL-3) of the C-terminal end for all the proteins included in our analysis (region 4 in Fig. 1). The aromatic residue is Phe for 19 out of 25 cytokines, and the remaining six cytokines have Tyr at this position. This pair of residues is also conserved in all the species of growth hormone, prolactin, and other prolactin-related molecules, proliferin and somatolactin (Linzer and Nathans, 1984; Ono et al., 1990). These two residues
Table I. Distribution of Exons Among Cytokines and Other Growth Factors (a)

Protein          Exon I        Exon II       Exon III     Exon IV          Exon V
hIL-2 (133)                    -20 to 29     30 to 49     50 to 97         98 to 133
mIL-2 (133)                    -20 to 43     44 to 63     64 to 112        113 to 149
mIL-3 (139)                    -27 to 28     29 to 42     43 to 74         89 to 139
                                                          75 to 88 (b)
hIL-4 (129)                    -24 to 21     22 to 37     38 to 96         97 to 129
mIL-4 (120)                    -20 to 24     25 to 40     41 to 91         92 to 120
hIL-5 (112)                    -22 to 26     27 to 37     38 to 80         81 to 112
mIL-5 (112)                    -21 to 26     27 to 37     38 to 80         81 to 112
hIL-6 (184)      -28 to -22    -22 to 42     43 to 80     81 to 129        130 to 184
mIL-6 (187)      -24 to -18    -18 to 44     45 to 82     83 to 132        133 to 187
hIL-7 (152)      -25 to -22    -21 to 23     24 to 50     51 to 94         114 to 152
                                                          95 to 113 (c)
mIL-7 (134)      -25 to -22    -22 to 23     24 to 50     51 to 94         95 to 134
hEPO (166)       -27 to -23    -23 to 26     27 to 55     56 to 115        116 to 166
mEPO (166)       -26 to -22    -22 to 26     27 to 55     56 to 115        116 to 166
hG-CSF (174)     -30 to -17    -17 to 38     39 to 74     75 to 123        124 to 177
hGM-CSF (113)                  -26 to 27     28 to 41     42 to 83         84 to 118 (b)
mGM-CSF (118)                  -26 to 27     28 to 41     42 to 83         84 to 118
hGH (191)        -26 to -23    -23 to 32     33 to 71     72 to 125        126 to 191
mGH (190)        -26 to -23    -23 to 30     31 to 70     71 to 124        125 to 190
hPRL (199)       -29 to -20    -19 to 39     40 to 76     77 to 136        137 to 199
mPRL (197)       -29 to -20    -19 to 37     38 to 74     75 to 134        135 to 197

(a) Exons II, III, IV, and V correspond to exons 1, 2, 3, and 4, respectively, in the text.
(b) mIL-3 and GM-CSF have one extra exon.
(c) hIL-7 has one extra exon which is missing in mIL-7.



are flanked by two polar residues and, in most cases, they are either basic (Arg and Lys) or acidic (Asp and Glu) residues (Fig. 1). The sequence pattern around this Phe-Leu pair can be written in terms of φ, X, and p, where φ is a hydrophobic residue, X is any amino acid, and p is a polar residue. The two nonpolar positions (φ) are frequently occupied by Leu and, in most cases, the last polar-type (p) position is either Arg or Lys.
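A search for the core of this conserved feature, an aromatic residue (Phe or Tyr) followed by Leu, flanked by polar residues and lying within 20 residues of the C-terminus, might be sketched as below. The set of residues counted as polar and the function name are assumptions made for illustration, and the example sequences are hypothetical rather than real cytokine sequences.

```python
import re

POLAR = "RKDENQSTH"  # assumption: residues treated as polar in this sketch

# Lookahead so that overlapping occurrences are all reported.
MOTIF = re.compile(rf"(?=([{POLAR}][FY]L[{POLAR}]))")


def find_c_terminal_motif(seq, window=20):
    """Return (1-based position of the aromatic residue, matched text) for
    every polar-[F/Y]-L-polar hit whose aromatic residue lies within
    `window` residues of the C-terminal end."""
    hits = []
    for m in MOTIF.finditer(seq):
        aromatic_pos = m.start() + 2          # 1-based index of F or Y
        if len(seq) - aromatic_pos <= window:
            hits.append((aromatic_pos, m.group(1)))
    return hits


if __name__ == "__main__":
    toy = "M" * 30 + "KFLR" + "A" * 10        # hypothetical sequence
    print(find_c_terminal_motif(toy))
```

The same function can be run over each aligned sequence to tabulate how often Phe versus Tyr occupies the aromatic position.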

The second most obvious conservation occurs at the beginning of exon 3 (region 3A in Fig. 1), and again an aromatic residue (Phe, Tyr, or Trp) is present in this motif (φXXF/Y/W). The first position φ is occupied by a nonpolar residue, preferably a leucine. This sequence pattern is observed for 18 out of 25 cytokines. Interestingly, the aromatic residue at the fourth position is followed by another sequence motif, QGL (Gln-Gly-Leu), for G-CSF, EPO, GM-CSF, and IL-5. The mouse sequence of IL-9 has the last two residues GL at the corresponding position. Another characteristic sequence pattern shared by all growth factors in the later part of exon 3 is the occurrence of hydrophobic residues in every third or fourth position for a 12-residue stretch (region 3B in Fig. 1). Though this is a common pattern for an α-helical repeat, most of the hydrophobic residues observed in this region for growth factors are Leu and Ile. Regarding the [...] many cases, nonpolar residues are also conserved three residues downstream from proline (Fig. 1).

In addition to these consensus patterns, there are also other discrete conserved regions in each exon that are unique among growth factors. For instance, most of the growth factors start with a stretch of Pro, Ser, and Thr. In exon 1, many of the nonpolar and polar residues occur at similar positions (region 1 in Fig. 1). The polar residues are also of similar types in many of these locations.



3.3. Homologous Regions Between Specific Growth 
Factors 

Apart from these global conserved regions, we 
also noticed certain regions that are homologous
between two specific growth factors. To identify these 
local homologies, we considered both identical as well 
as highly conservative substitutions between two 
sequences. Figure 2 shows these homologous blocks 
for different growth factors. 



Cytokine Sequence Homology 
The homology found between IL-3 and GM-CSF across the entire sequence, with only five gaps, is very striking. If we consider both identical and similar-type residues, about 50% of human IL-3 is homologous to human GM-CSF.
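A figure of this kind can be computed from a pairwise alignment by counting identical and conservatively substituted positions. The sketch below is illustrative only: the similarity groups are an assumption (the text does not list which substitutions were treated as conservative), the denominator is taken here to be the number of non-gap aligned positions, and the aligned strings in the usage lines are hypothetical.

```python
# Assumed groups of "similar type" residues; not taken from the paper.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KR"), set("DE"), set("ST"), set("NQ")]


def percent_homology(aln_a, aln_b):
    """Return (percent identical, percent identical-or-similar) over the
    aligned positions where neither sequence has a gap ('-')."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100 * identical / compared, 100 * (identical + similar) / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments with one gap.
    print(percent_homology("KFLA-D", "RFLGSA"))
```

Reporting identity and identity-plus-similarity separately makes it clear how much of the quoted homology depends on the choice of conservative substitution groups.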

IL-6 shares sequence homology with a number of proteins. There are two discrete homologous regions between IL-6 and EPO in exon 4. The homology between these two growth factors is also observed near the N-terminal end of exon 1. Our analysis also indicates that the disulfide loop (Cys separated by five residues) observed for IL-6 at the beginning of exon 2 is probably replaced by a shorter loop (Cys separated by three residues) in the case of EPO. Growth hormone also shows some specific homology to IL-6, in particular the conservation of aromatic residues in exon 3. These two proteins are not only similar in size, but their intron/exon positions are also in close proximity. Another protein that shows specific homology to IL-6 is IL-4. The C-terminal end of exon 1 and part of exon 4 are homologous between IL-6 and IL-4.

Other cytokines also have specific homologies. For example, part of exon 3 and part of exon 4 of IL-7 are homologous to the corresponding regions in IL-4 and IL-5, respectively (Fig. 2).



3.4. Secondary Structure Analysis 

The existence of discrete homologous regions among growth factors would also imply a similar secondary structural pattern for these cytokines. The secondary structural alignment confirms our prediction and reveals structural features that are common among these growth factors.
The secondary structures for the growth factors have been predicted using four different algorithms as described in the Methods section. We included IL-2 (Brandhuber et al., 1987) and GH (Abdel-Meguid et al., 1987), for which X-ray data are available, in our prediction analysis to check the reliability of these prediction methods. The secondary structure predictions not only indicated a significant amount of α-helical structure, but also showed some definite β-sheet or ambiguous (either α-helical or β-sheet) structures for each cytokine, including IL-2 and GH. However, the X-ray data for IL-2 and GH indicate that the regions predicted as possible β-sheets for these two proteins are either part of α-helical structures or irregular regions. A high sequence homology around these regions strongly suggests that other proteins may also have similar types of structures for these segments.

Further experimental evidence that other proteins also belong to the α-helical structure family comes from CD measurements of IL-4 (Windsor et al., [...]), [...] ([...] et al., 1990), EPO ([...], 1986), GM-CSF (Wingfield et al., 1988), and G-CSF (Lu et al., 1989). Based on these experimental data, we expect that all these proteins are completely devoid of β-sheet structures and further assume that all regions predicted as definite β-sheet adopt α-helical structures. The location of α-helices predicted for each cytokine is given in Table II.

According to our predictions, exon 4 of all cytokines (exon 5 for IL-3) starts with an irregular structure followed by a definitive α-helical segment (Fig. 1). The highly conserved residue pair Phe-Leu forms part of the α-helical segment. This helix, which is near the C-terminal end of the amino acid sequence,



Table II. Prediction of α-Helices in Cytokines

Protein    α-Helical segments (a)
IL-6       26-45, 58-72, 87-108, 115-140, 144-170
G-CSF      13-27, 45-54, 78-95, 103-126, 147-172
EPO        5-24, 46-55, 60-80, 91-114, 136-160
IL-4       5-21, 48-60, 72-88, 108-124
IL-7       10-24, 50-70, 75-91, 121-147
IL-5       8-26, 41-54, 64-80, 90-114
IL-9       4-19, 45-62, 67-82, 97-118
IL-3       9-29, 48-65, 68-82, 104-123
GM-CSF     13-26, 32-43, 53-67, 73-85, 101-119

(a) The sequence numbers correspond to the human sequences.
[...] helical segment [...] all cytokines included in our analysis. Exon 3 has two major α-helices separated by a short loop in all growth factors (Fig. [...]). Exon 1 also has one major helix for all cytokines. Surprisingly, the major portion of exon 2 is a random [...] IL-5, IL-7, and IL-9. Though IL-6 and G-CSF show some helical conformation in exon 2, they are also likely to have irregular structure for this region, or it may not be part of the core structure, because of the occurrence of two disulfide bonds and prolines in this segment. Also, a consensus secondary structure prediction from the three homologous proteins IL-6, G-CSF, and cMGF precludes the presence of α-helix in this region. GM-CSF and EPO are the only proteins that seem to have a definite α-helical structure in exon 2. It is known from X-ray analysis that IL-2 also has α-helical structure in exon 2. Introns in the peptides are observed to occur either [...] or near the periphery of α-helical regions (Fig. 3).

It could be argued that the sequence patterns observed for growth factors are characteristic of α-helical-type proteins. To validate the significance of these sequence homologies found for cytokines, we also analyzed the sequences of other α-helical-type proteins as a negative control. The proteins we studied include hemerythrin (Ward et al., 1975), cyto[...] ([...] et al., 1981). None of these proteins showed the characteristic sequence patterns observed for the growth factors.

3.5. Folding Topologies 

The three-dimensional structures for IL-2 (Brandhuber et al., 1987) and GH (Abdel-Meguid et al., 1987) determined by X-ray diffraction show that both have left-handed antiparallel four α-helical bundle-type structures. However, the connectivity of α-helices for these two proteins is different. IL-2 has the typical up-down-up-down connectivity with helix A connected to B, B to C, and C to D (Fig. 4a). The α-helices of GH are joined by up-up-down-down connectivity (one short-distance connection and two long-distance or overhand connections). The topological associations of the four α-helices are A to C, C to B, and B to D (Fig. 4b).

Crystal structure data analysis of four-α-helical-type proteins indicates that 12 out of 13 α-helical proteins, including IL-2 and GH, adopt all-antiparallel α-helical bundles (Presnell and Cohen, 1989). Theoretical energy calculations also suggest that the all-antiparallel type is more stable than the parallel type (Chou et al., 1988). Hence, it is a reasonable assumption that all cytokines will adopt the all-antiparallel-type structures. There are three possible topological connectivities for left-handed all-antiparallel bundles. These connectivities are up-down-up-down, up-up-down-down, and up-down-down-up (Fig. 4a-c). IL-2 and GH represent the first two topologies, respectively.

One common feature shared by GH and IL-2 is that the middle two α-helices of the four α-helical bundle are connected by a short loop and are antiparallel to each other. Since our prediction analysis also indicates that exon 3 of many cytokines has two α-helices separated by a short distance, they would invariably be antiparallel to each other and form the middle two α-helical segments. This would eliminate one of the three possibilities (up-down-down-up), leaving the remaining two topological connections as the possible candidates (Fig. 4). Using the disulfide bond assignments as further guidance, it should be possible to predict the folding topologies of certain proteins.
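The elimination argument can be written out as a small enumeration. The sketch below is a simplification under stated assumptions: each topology is reduced to the up/down orientation of helices A to D with helix A fixed as "up", overhand versus short connections are ignored, and the constraint tuples and function names are hypothetical conveniences rather than anything defined in the text.

```python
from itertools import product

HELICES = "ABCD"


def all_antiparallel_topologies():
    """Orientation patterns with helix A 'up' and two helices in each direction."""
    patterns = []
    for rest in product(("up", "down"), repeat=3):
        pattern = ("up",) + rest
        if pattern.count("up") == 2:
            patterns.append(pattern)
    return patterns


def satisfies(pattern, constraints):
    """constraints: iterable of (helix1, helix2, 'parallel' | 'antiparallel')."""
    for h1, h2, relation in constraints:
        same = pattern[HELICES.index(h1)] == pattern[HELICES.index(h2)]
        if same != (relation == "parallel"):
            return False
    return True


def candidates(constraints):
    """Names of the topologies compatible with all pairwise constraints."""
    return ["-".join(p) for p in all_antiparallel_topologies()
            if satisfies(p, constraints)]


if __name__ == "__main__":
    # Middle two helices antiparallel: removes up-down-down-up.
    print(candidates([("B", "C", "antiparallel")]))
    # A disulfide-derived constraint of the kind discussed in the text,
    # e.g. helices C and D parallel, narrows the choice further.
    print(candidates([("B", "C", "antiparallel"), ("C", "D", "parallel")]))
```

Expressing each disulfide observation as a pairwise parallel or antiparallel constraint keeps the reasoning for different proteins uniform and easy to check.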

3.5.1. IL-4

Human and murine IL-4 have three disulfide bonds (Carr et al., 1991). While the second S-S bond is identical for both species, the locations of the other two S-S bonds are different. In human IL-4, the first Cys is bonded to the sixth Cys, which is located near the C-terminal end. For murine IL-4, the same first Cys is bonded to the sixth Cys located in a loop between helices [...]. If both species adopt the same topology, then the location of these cysteines for both murine and human IL-4 should be on the same side. This condition will be satisfied only when helices C and D are parallel to each other and helices A and D are antiparallel (i.e., the topology of IL-4 will have the up-up-down-down connectivity similar to GH folding).

3.5.2. GM-CSF and IL-3

GM-CSF is another protein for which disulfide arrangements are known (Lu et al., 1989; Shanafelt and Kastelein, 1989). It has two disulfide bonds, and both are located in the second part of the molecule. The first S-S bond (between the first and third cysteines) connects the helical segment B and the loop between helices C and D, while the second S-S bond (between the second and fourth cysteines) connects the loop between C and D and the C-terminal end of the D-helix (Fig. 4c). This arrangement is possible only if helices C and D are parallel with one overhand connection. The orientation of helix A relative to helix B is not clear. There are two potential helical segments preceding helix B of GM-CSF (Fig. 3). If this protein is also of the all-antiparallel type, then one of these two helices should be parallel to helix B. The topological prediction for the latter part of the molecule agrees with the models proposed by Kaushansky et al. (1990) and Parry et al. (1988). However, neither model represents an all-antiparallel-type structure. IL-3 has one disulfide bond (Clark-Lewis et al., 1987) that connects the putative N-terminus of helix A and the putative C-terminus of helix C, which would argue that these two helices are antiparallel to each other (Fig. 4g). If GM-CSF is also structurally homologous to IL-3, this would further support that helix A of GM-CSF is parallel to helix B of the four α-helical bundle. A similar model, which has the up-up-down-down connectivity, has been proposed by Lokker et al. (1991) for IL-3.

3.5.3. EPO

For EPO, there is a disulfide bond between the N-terminal end and the C-terminal end (McDonald et al., 1986). This would suggest that these two α-helical segments are oriented in opposite directions (Fig. 4h). Since the middle two α-helices are also predicted to be antiparallel to each other, this protein can adopt either of the two possibilities (up-down-up-down or up-up-down-down folding topology) to satisfy the disulfide bonding connection (Fig. 4h). In this regard, Bazan (1990) has proposed the latter topology for EPO.

3.5.4. Other Proteins

The disulfide assignments for IL-5, IL-7, and IL-9 are not known. IL-5 has only two cysteines, which are located before helix B and after helix C, respectively. If we assume a disulfide bond between these two Cys, it would orient helices B and C in opposite directions, again supporting our prediction that the central two α-helices of a four α-helical bundle are antiparallel. Finally, the loop sizes and the α-helices of IL-6, G-CSF, and cMGF are comparable to those of GH, which tempts us to speculate that their structures are also similar to GH in that they all have up-up-down-down connectivity. Though this is consistent with the model proposed by Bazan (1990), more data are necessary to confirm our prediction. However, the observation of discrete blocks of homologous amino acid sequences shared by many cytokines (Table II) would strongly suggest that IL-4, IL-5, IL-6, IL-7, EPO, G-CSF, and cMGF all have similar folding topologies.



4. DISCUSSION

In this study, we have shown that the sequences of nine different cytokines and growth factors can be aligned and their secondary structure predicted. We also examined several other cytokines and growth factors whose receptors do not belong to the cytokine receptor superfamily (Bazan, 1990; Goodwin et al., 1990). For example, we attempted to match the sequence of IL-1 (Eisenberg et al., 1991) with the alignment reported here without success, indicating that IL-1 does not belong to this ligand superfamily. Interestingly, IL-1 has been shown to be primarily β-sheet in structure (Finzel et al., 1989; Graves et al., 1990). We also examined the sequence of IL-11 (Paul et al., 1990). Its sequence does not match our alignment either, and we would conclude that this cytokine does not belong to the ligand superfamily identified in this study. As a corollary to this conclusion, our work would predict that the receptor for IL-11 would not belong to the cytokine receptor superfamily. The identification and sequencing of the IL-11 receptor will be required to test our prediction.

The primary sequence and secondary structure 
analysis of cytokines and other growth factors exam- 
ined in our study suggest that these proteins might 



have evolved from a single primordial gene. The fea- 
tures that are found common at genomic, primary 
and secondary structural levels further support this 
hypothesis. However, the possibility that certain proteins within this family are likely to adopt different folding topologies would also imply at least two subclasses within this large superfamily.

The receptors for the cytokines studied here belong to a superfamily that also may have evolved from a single primordial gene. It is tempting to speculate upon the hypothesis of a primordial ligand and a primordial receptor divergently coevolving into the family of ligands and receptors discussed here. A detailed comparison of the subfamily pattern within both ligand and receptor superfamilies might allow us to test the validity of this hypothesis.

Our analysis also suggests that overlapping functions of certain proteins can be attributed to functional residues found at similar locations due to similar folding of the entire molecule or a major portion of it. In this context, IL-3 and GM-CSF share certain biological activities and they can cross-compete for binding on some cell lines. For example, the regions (18-22, 34-41, 52-61, and 94-115) of GM-CSF have recently been shown to be responsible for its biological [...] activity (Lokker et al., 1991), [...] with the functional region 94-115 of GM-CSF (Figs. 2 and 3). Subtle differences within these functional regions might also dictate the differences in biological activity between two growth factors. This could be tested by site-specific mutagenesis followed by a ligand-binding assay. Recently, this kind of approach has been attempted for GH to induce prolactin activity, and the results have been very successful (Cunningham et al., 1990; Cunningham and Wells, 1991). Our sequence alignment and the similar folding patterns inferred from these results should be extremely valuable in identifying critical regions within other members of this ligand family and in modifying a growth factor to make it highly selective for a particular function.
NOTE ADDED IN PROOF

The crystal structure of GM-CSF has been published (Diederichs, K., Boone, T., and Karplus, P. A. (1991) Science, vol. 254, 1779-1782) after the submission of this manuscript. The location of α-helices (A: 13-28, B: 55-64, C: 74-87, and D: 103-116) and their up-up-down-down connectivity agree very well with our prediction.
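The agreement can be checked residue by residue with a short overlap calculation. The sketch below assumes the predicted segments are those listed for GM-CSF in Table II and takes the crystal helices from the note above; the function names are hypothetical.

```python
# Predicted helical segments for GM-CSF as read from Table II (assumed mapping).
PREDICTED = [(13, 26), (32, 43), (53, 67), (73, 85), (101, 119)]
# Helices reported for the crystal structure in the note above.
CRYSTAL = {"A": (13, 28), "B": (55, 64), "C": (74, 87), "D": (103, 116)}


def overlap(seg1, seg2):
    """Number of residues shared by two inclusive (start, end) segments."""
    start, end = max(seg1[0], seg2[0]), min(seg1[1], seg2[1])
    return max(0, end - start + 1)


def best_matches(predicted, crystal):
    """For each crystal helix, return (best predicted segment,
    residues covered, crystal helix length)."""
    report = {}
    for name, helix in crystal.items():
        best = max(predicted, key=lambda seg: overlap(seg, helix))
        report[name] = (best, overlap(best, helix), helix[1] - helix[0] + 1)
    return report


if __name__ == "__main__":
    for name, (seg, covered, length) in best_matches(PREDICTED, CRYSTAL).items():
        print(f"helix {name}: predicted {seg} covers {covered} of {length} residues")
```

Under these assumptions, each crystal helix is matched by one predicted segment, while the predicted segment 32-43 has no counterpart among the four helices.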